Catalog Extraction in SZ Cluster Surveys: a matched filter approach
Abstract. We present a method based on matched multifrequency filters for extracting cluster catalogs from
Sunyaev-Zel'dovich (SZ) surveys. We evaluate its performance in terms of completeness, contamination rate and
photometric recovery for three representative types of SZ survey: a high resolution single frequency radio survey
(AMI), a high resolution ground-based multiband survey (SPT), and the Planck all-sky survey. These surveys are
not purely flux limited, and they lose completeness significantly before their point-source detection thresholds.
Contamination remains relatively low at < 5% (less than 30%) for a detection threshold set at S/N=5 (S/N=3).
We identify photometric recovery as an important source of catalog uncertainty: dispersion in recovered flux
from multiband surveys is larger than the intrinsic scatter in the Y-M relation predicted from hydrodynamical
simulations, while photometry in the single frequency survey is seriously compromised by confusion with primary
cosmic microwave background anisotropy. The latter effect implies that follow-up observations in other wavebands
(e.g., 90 GHz, X-ray) of single frequency surveys will be required. Cluster morphology can cause a bias in the
recovered Y-M relation, but has little effect on the scatter; the bias would be removed during calibration of the
relation. Point source confusion only slightly decreases multiband survey completeness; single frequency survey
completeness could be significantly reduced by radio point source confusion, but this remains highly uncertain
because we do not know the radio counts at the relevant flux levels.

Key words. 



1. Introduction 

Galaxy cluster catalogs play an important role in cosmology by furnishing unique information on the matter
distribution and its evolution. Cluster catalogs, for example, efficiently trace large-scale features, such as
the recently detected baryon oscillations (Eisenstein et al. 2005; Cole et al. 2005; Angulo et al. 2005;
Huetsi 2005), and provide a sensitive gauge of structure growth back to high redshifts (Oukbir & Blanchard 1992;
Rosati, Borgani & Norman 2002; Voit 2004 and references therein). This motivates a number of ambitious
projects proposing to use large, deep catalogs to constrain both galaxy evolution models and the cosmological
parameters, most notably the dark energy abundance and equation-of-state (Haiman, Mohr & Holder 2000;
Weller & Battye 2003; Wang et al. 2004). Among the most promising are surveys based on the Sunyaev-Zel'dovich
(SZ) effect (Sunyaev & Zeldovich 1970; Sunyaev & Zeldovich 1972; see Birkinshaw 1999 and
Carlstrom, Holder & Reese 2002 for reviews), because it does not suffer from surface brightness dimming and
because we expect the observed SZ signal to tightly correlate to cluster mass (Bartlett 2001; Motl et al. 2005).
Many authors have investigated the scientific potential of SZ surveys to constrain cosmology (e.g.,
Barbosa et al. 1996; Colafrancesco et al. 1997; Holder et al. 2000; Kneissl et al. 2001; Benson et al. 2002),
emphasizing the advantages intrinsic to observing the SZ signal.



Cosmological studies demand statistically pure catalogs with well understood selection criteria. As just
said, SZ surveys are intrinsically good in this light; however, many other factors (related, for example, to
instrumental properties, observing conditions, astrophysical foregrounds and data reduction algorithms)
influence the selection criteria. This has prompted some authors to begin more careful scrutiny of SZ survey
selection functions in anticipation of future observations (Bartlett 2001; Schulz & White 2003; White 2003;
Vale & White 2005; Melin et al. 2005; Juin et al. 2005).

In Melin et al. (2005), we presented a general formalism for the SZ selection function together with some
preliminary applications using a matched-filter cluster detection method. In this paper we give a thorough
presentation of our cluster detection method and evaluate its performance in terms of catalog completeness,
contamination and photometric recovery. We focus on three types of SZ survey: single frequency radio surveys
like the Arcminute MicroKelvin Imager (AMI interferometer) survey¹, multi-band ground-based bolometric surveys
such as the South Pole Telescope (SPT) survey², and the space-based Planck survey³. In each case, we quantify
the selection function using the formalism of Melin et al. (2005).

We draw particular attention to the oft-neglected issue of photometry. Even if the SZ flux-mass relation is
intrinsically tight, what matters in practice is the relation between the observed SZ flux and the mass.
Photometric errors introduce both bias and additional scatter in the observed relation. Calibration of the
Y-M relation will in principle remove the bias; calibration precision, however, depends crucially on the
scatter in the observed relation. Good photometry is therefore very important. As we will see, observational
uncertainty dominates the predicted intrinsic scatter in this relation in all cases studied.

We proceed as follows. In Section 2, we discuss cluster detection techniques and present the matched filter
formalism. We describe our detection algorithm in Section 3.
Using Monte Carlo simulations of the three types of sur- 
vey, we discuss catalog completeness, contamination and 
photometry. This is done in Section 4 under the ideal sit- 
uation where the filter perfectly matches the simulated 
clusters and in the absence of point sources. In Section 
5 we examine effects caused by cluster morphology, using 
N-body simulations, and then the effect of point sources. 
We close with a final discussion and conclusions in Section 
6. 



2. Detecting Clusters 

The detection and photometry of extended sources presents a complexity well appreciated in astronomy.
Many powerful algorithms, such as SExtractor (Bertin & Arnouts 1996), have been developed to
extract extended sources superimposed on an unwanted background. They typically estimate the local
background level and group pixels brighter than this level into individual objects. Searching for clusters at
millimeter wavelengths poses a particular challenge to this approach, because the clusters are embedded in the
highly variable background of the primary CMB anisotropies and Galactic emission. Realizing the importance of
this issue, several authors have proposed specialized techniques for SZ cluster detection. Before detailing
our own method, we first briefly summarize some of this work in order to motivate our own approach and place
it in context.

1 http://www.mrao.cam.ac.uk/telescopes/ami/

2 http://astro.uchicago.edu/spt/

3 http://astro.estec.esa.nl/Planck/

2.1. Existing Algorithms 

Diego et al. (2002) developed a method designed for the Planck mission that is based on application of
SExtractor to SZ signal maps constructed by combining different frequency channels. It makes no assumption
about the frequency dependence of the different astrophysical signals, nor the cluster SZ emission profile.
The method, however, requires many low-noise maps over a broad range of frequencies in order to construct the
SZ map to be processed by SExtractor. Although they will benefit from higher resolution, planned ground-based
surveys will have fewer frequencies and higher noise levels, making application of this method difficult.

In another approach, Herranz et al. (2002a, 2002b; see also López-Caniego et al. 2005 for point-source
applications) developed an ingenious filter (Scale Adaptive Filter) that simultaneously extracts cluster size
and flux. Defined as the optimal filter for a map containing a single cluster, it does not account for source
blending. Cluster-cluster blending could be an important source of confusion in future ground-based
experiments, resulting in poorly estimated source size and flux.

Hobson & McLachlan (2003) recently proposed a powerful Bayesian detection method using a Monte Carlo
Markov Chain. The method simultaneously solves for the position, size, flux and morphology of clusters in a
given map. Its complexity and run-time, however, rapidly increase with the number of sources.

More recently, Schafer et al. (2004) generalized scale adaptive and matched filters to the sphere for the
Planck all-sky SZ survey. Pierpaoli et al. (2004) propose a method based on wavelet filtering, studying
clusters with complex shapes. Vale & White (2005) examine cluster detection using different filters (matched,
wavelets, Mexican hat), comparing completeness and contamination levels.

Finally, Pires et al. (2005) introduced an independent component analysis on simulated multi-band data to
separate the SZ signal, followed by nonlinear wavelet filtering and application of SExtractor.

Our aim here is two-fold: to present and extensively evaluate our own SZ cluster catalog extraction method,
and to use it in a comprehensive study of SZ survey selection effects. The two are in fact inseparable. First
of all, selection effects are specific to a particular catalog extraction method. Secondly, we require a
robust, rapid algorithm that we can run over a large number of simulated data sets in order to accurately
quantify the selection effects. This important consideration conditions the kind of extraction algorithm that
we can use. With this in mind, we have developed a fast catalog construction algorithm based on matched
filters for both single and multiple frequency surveys. It is based on the approach first proposed by
Herranz et al., but accounts for source blending.
Fig. 1. Two examples of the matched filter for $\theta_c = 1$ arcmin. The curves give the radial profiles of the
filters, which are symmetric because we have chosen a symmetric cluster template. Left: filter for a single
frequency survey with a $\theta_{\rm fwhm} = 1.5$ arcmin beam and 8 μK instrumental noise/beam (AMI-like, see
Table 1). The undulating form of the filter maximizes the cluster signal while reducing contamination from
primary CMB anisotropy. Right: the three components of the 3-band filter for a SPT-like experiment (Table 1).
The filter is arbitrarily normalized to unity at 150 GHz. The filter uses both spatial and frequency weighting
to optimally extract the cluster signal from the CMB and instrument noise. Although in this Figure the filters
continue to large radii, in practice we truncate them at $10\theta_c$.



After describing the method, we apply the formalism given in Melin et al. (2005) to quantify the selection
function and contamination level in upcoming SZ surveys. We take as representative survey configurations AMI,
SPT and Planck, and Monte Carlo simulate the entire catalog extraction process from a large ensemble of
realizations for each configuration. By comparing to the simulated input catalogs, we evaluate the extracted
catalogs in terms of their completeness, contamination and photometric accuracy/precision. We will place
particular emphasis on the importance of the latter, something which has received little attention in most
studies of this kind.

2.2. Matched Filters 

The SZ effect is caused by the hot gas ($T \sim 1$-$10$ keV) contained in galaxy clusters, known as the
intracluster medium (ICM); electrons in this gas up-scatter CMB photons and create a unique spectral
distortion that is negative at radio wavelengths and positive in the submillimeter, with a zero-crossing near
220 GHz. The form of this distortion is universal (in the non-relativistic limit applicable to most clusters),
while the amplitude is given by the Compton $y$ parameter, an integral of the gas pressure along the
line-of-sight. In a SZ survey, clusters will appear as sources extended over arcminute scales (apart from the
very nearby objects, which are already known) with brightness profile

$$\Delta i_\nu(\vec{x}) = y(\vec{x})\, j_\nu \tag{1}$$

relative to the mean CMB brightness. Here $y(\vec{x})$ is the Compton $y$ parameter at position $\vec{x}$ (a 2D
vector on the sky) and $j_\nu$ is the SZ spectral function evaluated at the observation frequency $\nu$.



Matched filters for SZ observations were first proposed by Haehnelt & Tegmark (1996) as a tool to estimate
cluster peculiar velocities from the kinetic effect, and Herranz et al. (2002a, 2002b) later showed how to use
them to detect clusters via the thermal SZ effect. They are designed to maximally enhance the signal-to-noise
for a SZ cluster source by optimally (in the least square sense) filtering the data, which in our case is a
sky map or set of maps at different frequencies. They do so by incorporating prior knowledge of the cluster
signal, such as its spatial and spectral characteristics. The unique and universal frequency spectrum of the
thermal SZ effect (in the non-relativistic regime) is hence well suited for a matched-filter approach.

Less clear is the choice of the spatial profile $T_{\theta_c}(\vec{x})$ to adopt for cluster SZ emission. One
aims to choose a spatial template that represents as well as possible the average SZ emission profile. In
other words, we want $T_{\theta_c}(\vec{x}) = \langle y(\vec{x})/y_o \rangle_C$, where the average is over many
clusters of size $\theta_c$. In the following, we choose to describe clusters with a projected spherical
$\beta$-profile:

$$y(\vec{x}) = y_o \left(1 + |\vec{x}|^2/\theta_c^2\right)^{-(3\beta-1)/2} \tag{2}$$

with $\beta = 2/3$ (with one exception, shown for comparison in Figure 2). The spatial template is therefore
described by a single parameter, the core radius $\theta_c$; in our calculations, we truncate the profile at
$10\theta_c$. This is a reasonable choice, given X-ray observations (Arnaud 2005) of the intracluster medium and
the resolution of planned SZ surveys.
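
As an illustration of the template in Eq. (2), the following is a minimal sketch that evaluates the unit-amplitude $\beta$-profile on a square pixel grid and applies the truncation at $10\theta_c$. The function name `beta_template`, its arguments and the choice of a flat pixel grid are assumptions made for this example; they are not taken from the original work.

```python
import numpy as np

def beta_template(npix, pix_arcmin, theta_c, beta=2.0 / 3.0, trunc=10.0):
    """Unit-amplitude projected beta-profile on a square grid (hypothetical helper).

    T(x) = (1 + |x|^2 / theta_c^2)^(-(3*beta - 1)/2), set to zero beyond
    trunc * theta_c. Angles are in arcmin; the peak sits at pixel npix // 2.
    """
    offsets = (np.arange(npix) - npix // 2) * pix_arcmin
    xx, yy = np.meshgrid(offsets, offsets, indexing="xy")
    r2 = xx**2 + yy**2
    profile = (1.0 + r2 / theta_c**2) ** (-(3.0 * beta - 1.0) / 2.0)
    profile[r2 > (trunc * theta_c) ** 2] = 0.0   # truncate at trunc * theta_c
    return profile
```

With the fiducial $\beta = 2/3$ the exponent is $-1/2$, so the profile falls to $1/\sqrt{2}$ of its central value at one core radius.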

In reality, of course, we know neither this average profile precisely nor the dispersion of individual
clusters around it beforehand. This is an important point, because our choice for the template will affect
both the detection efficiency and photometric accuracy. Detection efficiency will be reduced if the template
does not well represent the average profile and, as will become clear below, the photometry will be biased. In
general, the survey selection function unavoidably suffers from uncertainty induced by unknown source
astrophysics (in addition to other sources of uncertainty).

In the following, we first study (Section 4) the ideal case where the filters perfectly match the cluster
profiles, i.e., we use the $\beta$-model for both our simulations and as the detection template. In a later
section (5), we examine the effects caused by non-trivial cluster morphology, as well as by point source
confusion.

Consider a cluster with core radius $\theta_c$ and central $y$-value $y_o$ positioned at an arbitrary point
$\vec{x}_o$ on the sky. For generality, suppose that the region is covered by several maps $M_i(\vec{x})$ at $N$
different frequencies $\nu_i$ ($i = 1, \ldots, N$). We arrange the survey maps into a column vector
$\mathbf{M}(\vec{x})$ whose $i$-th component is the map at frequency $\nu_i$; this vector reduces to a scalar map
in the case of a single frequency survey. Our maps contain the cluster SZ signal plus noise:

$$\mathbf{M}(\vec{x}) = y_o\, \mathbf{j}_\nu\, T_{\theta_c}(\vec{x} - \vec{x}_o) + \mathbf{N}(\vec{x}) \tag{3}$$

where $\mathbf{N}$ is the noise vector (whose components are noise maps at the different observation
frequencies) and $\mathbf{j}_\nu$ is a vector with components given by the SZ spectral function $j_\nu$
evaluated at each frequency. Noise in this context refers to both instrumental noise as well as all signals
other than the cluster thermal SZ effect; it thus also comprises astrophysical foregrounds, for example, the
primary CMB anisotropy, diffuse Galactic emission and extragalactic point sources.

We now build a filter $\mathbf{\Psi}_{\theta_c}(\vec{x})$ (in general, a column vector in frequency space) that
returns an estimate, $\hat{y}_o$, of $y_o$ when centered on the cluster:

$$\hat{y}_o = \int d^2x\; \mathbf{\Psi}_{\theta_c}^t(\vec{x} - \vec{x}_o) \cdot \mathbf{M}(\vec{x}) \tag{4}$$

where superscript $t$ indicates a transpose (with complex conjugation when necessary). This is just a linear
combination of the maps, each convolved with its frequency-specific filter $(\mathbf{\Psi}_{\theta_c})_i$. We
require an unbiased estimate of the central $y$ value, so that $\langle \hat{y}_o \rangle = y_o$, where the
average here is over both total noise and cluster (of core radius $\theta_c$) ensembles. Building the filter
with the known SZ spectral form and adopted spatial template optimizes the signal-to-noise of the estimate; in
other words, the filter is matched to the prior information. The filter is now uniquely specified by demanding
a minimum variance estimate. The result expressed in Fourier space (the flat sky approximation is reasonable
on cluster angular scales) is (Haehnelt & Tegmark 1996; Herranz et al. 2002a; Melin et al. 2005):



$$\mathbf{\Psi}_{\theta_c}(\vec{k}) = \sigma_{\theta_c}^2\, \mathbf{P}^{-1}(\vec{k}) \cdot \mathbf{F}_{\theta_c}(\vec{k}) \tag{5}$$

where

$$\mathbf{F}_{\theta_c}(\vec{k}) \equiv \mathbf{j}_\nu\, T_{\theta_c}(\vec{k}) \tag{6}$$

$$\sigma_{\theta_c} \equiv \left[ \int d^2k\; \mathbf{F}_{\theta_c}^t(\vec{k}) \cdot \mathbf{P}^{-1} \cdot \mathbf{F}_{\theta_c}(\vec{k}) \right]^{-1/2} \tag{7}$$

with $\mathbf{P}(\vec{k})$ being the noise power spectrum, a matrix in frequency space with components $P_{ij}$
defined by $\langle N_i(\vec{k}) N_j^*(\vec{k}') \rangle_N = P_{ij}(\vec{k})\, \delta(\vec{k} - \vec{k}')$. The quantity
$\sigma_{\theta_c}$ gives the total noise variance through the filter. When we speak of the signal-to-noise of
a detection, we refer to $\hat{y}_o/\sigma_{\theta_c}$.

Fig. 2. Filter noise expressed in terms of integrated SZ flux $Y$, that is,
$\sigma_Y \equiv \sigma_{\theta_c} \int T_{\theta_c}(\vec{x})\, d\vec{x}$, as a function of template core radius
$\theta_c$ for the three experiments listed in Table 1. A cluster with $Y = \sigma_Y$ would be detected at a
signal-to-noise ratio $q = 1$. At a fixed detection threshold $q$ (e.g., 3 or 5), the completeness of a survey
rapidly increases from zero to unity in the region above its corresponding curve $q\sigma_Y(\theta_c)$
(Melin et al. 2005). All the curves adopt our fiducial value of $\beta = 2/3$, except the dashed-triple-dotted
red curve, shown for comparison, which corresponds to the SPT case with $\beta = 0.6$; this curve is
systematically higher by 2.5% to 13%, depending on $\theta_c$.

We write the noise power spectrum as a sum
$P_{ij} = P_i^{\rm noise}\, \delta_{ij} + B_i(\vec{k}) B_j^*(\vec{k})\, P_{ij}^{\rm sky}$, where $P_i^{\rm noise}$
represents the instrumental noise power in band $i$, $B(\vec{k})$ the observational beam and $P_{ij}^{\rm sky}$
gives the foreground power (non-SZ signal) between channels $i$ and $j$. As explicitly written, we assume
uncorrelated instrumental noise between observation frequencies. Note that we treat the astrophysical
foregrounds as isotropic, stationary random fields with zero mean. The zero mode is, in any case, removed from
each of the maps, and the model certainly applies to the primary CMB anisotropy. It should also be a
reasonable model for fluctuations of other foregrounds about their mean, at least over cluster scales⁴.
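
The filter of Eqs. (4)-(7) can be sketched numerically for the single-frequency case, where the vectors and the matrix $\mathbf{P}$ reduce to scalars. The code below is a discrete, periodic-grid analogue and not the implementation used for this work: the function names `matched_filter` and `apply_filter`, the use of NumPy FFTs, and the normalization convention for the noise power are all assumptions stated in the comments.

```python
import numpy as np

def matched_filter(template, noise_power):
    """Return (psi_hat, sigma) for a single-frequency matched filter (sketch).

    template    : 2D array, unit-amplitude cluster profile with its peak at
                  the central pixel (n // 2, n // 2).
    noise_power : 2D array P(k) on the numpy FFT grid, with the assumed
                  convention P(k) = <|N_hat(k)|^2> / npix, so that white noise
                  of per-pixel variance s^2 has P = s^2 everywhere.
    """
    t_hat = np.fft.fft2(np.fft.ifftshift(template))  # move the peak to pixel (0, 0)
    weight = np.sum(np.abs(t_hat) ** 2 / noise_power)
    sigma = np.sqrt(template.size / weight)           # discrete analogue of Eq. (7)
    psi_hat = sigma**2 * t_hat / noise_power          # discrete analogue of Eq. (5)
    return psi_hat, sigma

def apply_filter(sky_map, psi_hat):
    """Amplitude map J(x0) = sum_x psi(x - x0) * M(x), i.e. Eq. (4) at every x0."""
    return np.real(np.fft.ifft2(np.conj(psi_hat) * np.fft.fft2(sky_map)))
```

In this convention the normalization makes the estimate unbiased: filtering a noiseless map that contains the template scaled by $y_o$ returns $y_o$ at the cluster position, and for white noise of per-pixel rms $s$ the filter noise reduces to $s / \sqrt{\sum_x T^2}$.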

Two examples of the matched filter for $\theta_c = 1$ arcmin are shown in Fig. 1, one for an AMI-like single
frequency survey with a 1.5 arcmin beam (left-hand panel) and the other for a SPT-like 3-band filter
(right-hand panel); see Table 1 for the experimental characteristics. The filters are circularly symmetric,
with the figures giving their radial profiles, because we have chosen a spherical cluster model. We clearly
see the spatial weighting used by the single frequency filter to optimally extract the cluster from the noise
and CMB backgrounds. The multiple frequency filter $\mathbf{\Psi}_{\theta_c}$ is a 3-element column vector
containing filters for each individual frequency. In this case, the filter employs both spectral and spatial
weighting to optimally extract the cluster signal.

4 We make no assumption about the Gaussianity of the fields; the estimator remains unbiased even if they are
not Gaussian, although optimality must be redefined in this case.

Figure 2 shows the filter noise as a function of template core radius $\theta_c$. We plot the filter noise
expressed in terms of an equivalent noise $\sigma_Y \equiv \sigma_{\theta_c} \int T_{\theta_c}(\vec{x})\, d\vec{x}$
on the integrated SZ flux $Y$. The dashed-triple-dotted red curve with $\beta = 0.6$ is shown for comparison to
gauge the impact of changing this parameter, otherwise fixed at $\beta = 2/3$ throughout this work.
Melin et al. (2005) use the information in this figure to construct survey completeness functions. At fixed
signal-to-noise $q$, the completeness of a survey rapidly increases to unity in the region above the curve
$q\sigma_Y$. The Figure shows that high angular resolution ground-based surveys (e.g., AMI, SPT) are not purely
flux limited, because their noise level rises significantly with core radius. The lower resolution of the
Planck survey, on the other hand, results in a more nearly flux limited sample.

3. Catalog Extraction 

Catalog construction proceeds in three steps, the last two of which are repeated⁵:

1. Convolution of the frequency map(s) with matched filters corresponding to different cluster sizes;

2. Identification of candidate clusters as objects with signal-to-noise $\hat{y}_o/\sigma_{\theta_c} > q$,
where $q$ is our fixed detection threshold, followed by photometry of the brightest remaining cluster
candidate, which is then added to the final cluster catalog;

3. Removal of this object from the set of filtered maps using the photometric parameters (e.g., $y_o$ and
$\theta_c$) from the previous step.

We loop over the last two steps until there are no remaining candidates above the detection threshold. The
following sections detail each step.
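
The loop over steps 2 and 3 can be sketched as follows. This is an illustrative outline only: the function name `extract_catalog`, the array layouts, the periodic shifting with `np.roll` and the `max_objects` guard are assumptions made for this example, not details given in the text.

```python
import numpy as np

def extract_catalog(amp_maps, sigmas, filtered_templates, q=5.0, max_objects=1000):
    """Iteratively detect and subtract the brightest candidate (sketch).

    amp_maps           : array (n_scales, ny, nx), filtered amplitude maps.
    sigmas             : array (n_scales,), filter noise for each scale.
    filtered_templates : array (n_scales, n_scales, ny, nx); entry [c, f] is the
                         response of filter f to a unit-amplitude cluster of
                         scale c centred on pixel (0, 0), with periodic wrapping.
    Returns a list of (iy, ix, scale_index, y_hat, snr) tuples.
    """
    residual = np.array(amp_maps, dtype=float)      # work on a copy
    sigmas = np.asarray(sigmas, dtype=float)
    catalog = []
    for _ in range(max_objects):
        snr = residual / sigmas[:, None, None]
        c, iy, ix = np.unravel_index(np.argmax(snr), snr.shape)
        if snr[c, iy, ix] <= q:
            break                                    # nothing left above threshold
        y_hat = residual[c, iy, ix]
        catalog.append((int(iy), int(ix), int(c), float(y_hat), float(snr[c, iy, ix])))
        # Remove the object from every filtered map using the template library.
        shifted = np.roll(filtered_templates[c], shift=(iy, ix), axis=(1, 2))
        residual -= y_hat * shifted
    return catalog
```

Taking the brightest object first and subtracting it at every filter scale before searching again is what de-blends neighbouring sources in this scheme.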

3.1. Map filtering 

In the first step, we convolve the observed map(s) with matched filters covering the expected range of core
radii. For AMI and SPT, for example, we vary $\theta_c$ from 0.1 to 3 arcmin in steps of 0.1 arcmin (i.e.,
$\theta_c = 0.1, 0.2, \ldots, 2.9, 3$ arcmin) and add three values for the largest clusters: 4, 5 and 6 arcmin.
We thus filter the map(s) $n_{\theta_c}$ times ($n_{\theta_c} = 33$ for AMI and SPT) to obtain $2 n_{\theta_c}$
filtered maps, $J_{\theta_c}$ and $L_{\theta_c}$. The $n_{\theta_c}$ maps $J_{\theta_c}$ give the SZ amplitude
(obtained using $\mathbf{\Psi}_{\theta_c}$), while the $n_{\theta_c}$ maps $L_{\theta_c}$ give the
signal-to-noise ratio: $L_{\theta_c} = J_{\theta_c}/\sigma_{\theta_c}$. We set a detection threshold at fixed
signal-to-noise $q$ and identify candidates at each filter scale $\theta_c$ as pixels with $L_{\theta_c} > q$.
Common values for the threshold are $q = 3$ and $q = 5$; the choice is a tradeoff between detection and
contamination rates (see below).

5 Note that we have made some changes in the two last steps compared to the description given in
Melin et al. (2005). We no longer sort candidates in a tree structure for de-blending; instead, we identify
and then remove candidates one by one from the filtered maps. This has only a small impact on the completeness
of the detection algorithm, leaving the conclusions of our previous paper intact. The changes, however,
greatly improve photometry and lower contamination.
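
The grid of filter scales and the thresholding step described above can be written compactly. The names `theta_c_grid`, `snr_maps` and `candidate_mask` are hypothetical and introduced only for this sketch; the array layout (one filtered map per scale, stacked along the first axis) is likewise an assumption.

```python
import numpy as np

# Core radii in arcmin: 0.1 to 3.0 in steps of 0.1, plus 4, 5 and 6.
fine_scales = np.arange(1, 31) / 10.0
coarse_scales = np.array([4.0, 5.0, 6.0])
theta_c_grid = np.concatenate([fine_scales, coarse_scales])

def snr_maps(amp_maps, sigmas):
    """L = J / sigma for each filter scale (arrays of shape (n_scales, ny, nx))."""
    return np.asarray(amp_maps) / np.asarray(sigmas)[:, None, None]

def candidate_mask(amp_maps, sigmas, q=5.0):
    """Boolean mask of pixels with signal-to-noise above the threshold q."""
    return snr_maps(amp_maps, sigmas) > q
```

Building the fine grid from integers divided by ten avoids the floating-point drift that a direct `np.arange(0.1, 3.1, 0.1)` can introduce, and gives the 33 scales quoted for AMI and SPT.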

3.2. Cluster parameter estimation: Photometry 

We begin the second step by looking for the brightest candidate pixel in the set of maps $L_{\theta_c}$. The
candidate cluster is assigned the spatial coordinates $(x, y)$ of this pixel, and its core radius is defined as
the filter scale of the map containing the pixel: $\theta_c = \theta_f$. We then calculate the total
integrated flux using $Y = \hat{y}_o \int T_{\theta_c}(\vec{x})\, d\vec{x}$, where $\hat{y}_o$ is taken from the
map $J_{\theta_c}$ at the same filter scale. We refer to this step as the photometric step, and the parameters
$\hat{y}_o$, $\theta_c$ and $Y$ as photometric parameters. Note that measurement error on $Y$ comes from errors
on both $\hat{y}_o$ and $\theta_c$ (we return to this in greater detail in Section 4).
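
For the fiducial $\beta = 2/3$ the template integral entering the flux has a simple closed form. The short derivation below assumes a flat-sky integral over a disc out to the truncation radius $10\theta_c$ adopted for the template; it is given as a worked illustration and does not appear in this form in the text.

```latex
\int T_{\theta_c}(\vec{x})\, d\vec{x}
  = 2\pi \int_0^{10\theta_c} \left(1 + \frac{r^2}{\theta_c^2}\right)^{-1/2} r\, dr
  = 2\pi\, \theta_c^2 \left[\sqrt{1 + \frac{r^2}{\theta_c^2}}\,\right]_0^{10\theta_c}
  = 2\pi\, \theta_c^2 \left(\sqrt{101} - 1\right)
  \approx 56.9\, \theta_c^2 ,
\qquad\text{so}\qquad
Y \approx 56.9\; \hat{y}_o\, \theta_c^2 .
```

Because $Y$ scales as $\theta_c^2$ at fixed $\hat{y}_o$, an error in the recovered core radius propagates strongly into the integrated flux, which is why both parameters matter for the photometry.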

3.3. Catalog construction 

The candidate cluster is now added to the final cluster catalog, and we proceed by removing it from the set
of filtered maps $J_{\theta_c}$ and $L_{\theta_c}$ before returning to step 2. To this end, we construct
beforehand a 2D array (library) of un-normalized, filtered cluster templates (postage-stamp maps)

$$T_{\theta_c,\theta_f}(\vec{x}) = \int d^2x'\; \Psi_{\theta_f}(\vec{x}' - \vec{x})\, T_{\theta_c}(\vec{x}') \tag{8}$$

with the cluster centered in the map. Note that $\theta_c$ runs over core radius and $\theta_f$ over filter
scale. At each filter scale $\theta_f$, we place the normalized template $\hat{y}_o T_{\theta_c,\theta_f}$ on the
cluster position $(x, y)$ and subtract it from the map. The library of filtered templates allows us to perform
this step rapidly.

We then return to step 2 and repeat the process until there are no remaining candidate pixels. Thus, clusters
are added to the catalog while being subtracted from the maps one at a time, thereby de-blending the sources.
By pulling off the brightest clusters first, we aim to minimize uncertainty in the catalog photometric
parameters. Nevertheless, it must be emphasized that the entire procedure relies heavily on the use of
templates and that real clusters need not conform to the chosen profiles. We return to the effects of cluster
morphology below.

In the end, we have a cluster catalog with positions $(x, y)$, central Compton $y$ parameters, sizes
$\theta_c$ and fluxes $Y$.
Type      Frequencies   Res. fwhm   Inst. noise   Area
          [GHz]         [arcmin]    [μK/beam]     [deg²]

AMI       15            1.5         8             10

SPT       150           1           10            4000
          220           0.7         60
          275           0.6         100

Planck    143           7.1         6             41253
          217           5           13
          353           5           40

Table 1. Characteristics of the three types of experiments considered. We run our extraction method on 100
sky patches of 3 x 3 square degrees (for AMI and SPT) and 12 x 12 square degrees (for Planck).



4. Cluster recovery 

We tested our catalog construction method on simulated observations of the three representative types of SZ
survey specified in Table 1. The simulations include SZ emission, primary CMB anisotropy and instrumental
noise and beam smearing. We do not include diffuse Galactic foregrounds in this study. We begin in this
section with the ideal case where the filter perfectly matches the simulated clusters (spherical $\beta$-model
profiles) and in the absence of extragalactic point sources. We return to the additional effects of cluster
morphology and point source confusion in Section 5.

The simulated maps are generated by Monte Carlo. We first create a realization of the linear matter
distribution in a large box using the matter power spectrum. Clusters are then distributed according to their
expected number density, given by the mass function, and bias as a function of mass and redshift. We also give
each cluster a peculiar velocity consistent with the matter distribution according to linear theory. The
simulations thus feature cluster spatial and velocity correlations accurate to first order, which is a
reasonable approximation on cluster scales. In this paper, we use these simulations but we do not study the
impact of the correlations on the detection method, leaving this issue to forthcoming work.

The cluster gas is modeled by a spherical isothermal $\beta$-profile with $\beta = 2/3$ and
$\theta_c/\theta_v = 0.1$, where $\theta_v$ is the angular projection of the virial radius, which varies with
cluster mass and redshift following a self-similar relationship. We choose an $M$-$T$ relation consistent with
the local abundance of X-ray clusters and our value of $\sigma_8$, given below (Pierpaoli et al. 2004).
Finally, we fix the gas mass fraction at $f_{\rm gas} = 0.12$ (e.g., Mohr et al. 1999). The input catalog consists
of clusters with total mass $M > 10^{14} M_\odot$, which is sufficient given the experimental characteristics
listed in Table 1. Delabrouille et al. (2002) describe the simulation method in more detail.

We generate primary CMB anisotropies using the power spectrum calculated by CMBFAST⁶
(Seljak & Zaldarriaga 1996) for a flat concordance model with $\Omega_M = 0.3 = 1 - \Omega_\Lambda$
(Spergel et al. 2003), Hubble constant $H_0 = 70$ km/s/Mpc (Freedman et al. 2001) and a power spectrum
normalization $\sigma_8 = 0.98$. As a last step we smooth the map with a Gaussian beam and add Gaussian white
noise to model instrumental effects⁷.

We simulate maps that would be obtained from the proposed surveys listed in Table 1. The first is an
AMI⁸-like experiment (Jones et al. 2002), a single frequency, high resolution interferometer; the sensitivity
corresponds to a one-month integration time per 0.1 square degree (Kneissl et al. 2001). The SPT⁹-like
experiment (Ruhl et al. 2004) is a high resolution, multi-band bolometer array. We calculate the noise levels
assuming an integration time of 1 hour per square degree, and a split of 2/3, 1/6, 1/6 of the 150, 220, 275 GHz
channels for the 1000 detectors in the focal plane array (Ruhl et al. 2004). Finally, we consider the
space-based Planck¹⁰-like experiment, with a nominal sensitivity for a 14 month mission. For the AMI and SPT
maps we use pixels¹¹ of 30 arcsec, while for Planck the pixels are 2.5 arcmin.

We simulate 100 sky patches of 3 x 3 square degrees for both AMI and SPT, and of 12 x 12 square degrees for
Planck. This is appropriate given the masses of detected clusters in each experiment. In practice, AMI will
cover a few square degrees, similar to the simulated patch, while SPT will cover 4000 square degrees and
Planck will observe the entire sky. Thus, the surveys decrease in sensitivity while increasing sky coverage
from top to bottom in Table 1 (see also Table 2).



[deg⁻²]   S/N > 3    S/N > 5

AMI       44         20
          (38)       (16)

SPT       35         12
          (27)       (11)

Planck    1.00       0.38
          (0.84)     (0.35)

Table 2. Extracted counts/sq. deg. from simulations of the three types of survey. The numbers in parentheses
give the counts predicted by our analytic cluster model; the difference is due to cluster overlap confusion
(see text).



6 http://cmbfast.org/ (Seljak & Zaldarriaga 1996)

7 The 3-year WMAP results, published after the work presented here was finished, favor a significantly lower
value of $\sigma_8$ (Spergel et al. 2006). This could lower the total number of clusters in our simulations by
up to a factor of ~2. As we are interested here in catalog recovery, where we compare output to input
catalogs, this change should only cause relatively minor changes to our final results.

8 http://www.mrao.cam.ac.uk/telescopes/ami/index.html

9 http://astro.uchicago.edu/spt/

10 http://www.rssd.esa.int/index.php?project=PLANCK

11 Pixel sizes are at least 2 times smaller than the best channel of each experiment.


Melin et al.: Catalog Extraction in SZ Cluster Surveys 







Fig. 3. Cluster counts $N(> Y)$ per square degree as a function of true SZ flux $Y$ (horizontal axis:
$Y_{\rm true}$ in arcmin²) for a threshold of S/N > 5. The dash-dotted black line gives the cluster counts from
the mass function (Jenkins et al. 2001). The dashed blue line gives the recovered cluster counts for AMI, the
red solid line for SPT and the dotted green line for Planck. The inset shows the completeness ratio (relative
to the mass function prediction) for each survey. All the surveys are significantly incomplete at their
point-source sensitivities (5 times the y-intercept in Figure 2).

4.1. Association criteria 

An important issue for catalog evaluation is the associa- 
tion between a detected object (candidate cluster) with a 
cluster from the simulation input catalog (real cluster); in 
other words, a candidate corresponds to which, if any, real 
cluster. Any association method will be imprecise, and es- 
timates of catalog completeness, contamination and pho- 
tometric accuracy will unavoidably depend on the choice 
of association criteria. 

We proceed as follows: for each detection, we look at 
all input clusters with centers positioned within a distance 
$r = \sqrt{8} \times d$, where $d$ is the pixel size ($d = 30$ arcsec for
AMI and SPT, d = 2.5 arcmin for Planck); this covers the 
neighboring 24 pixels. If there is no input cluster, then we 
have a false detection; otherwise, we identify the candi- 
date with the cluster whose flux is closest to that of the 
detection. After running through all the candidates in this 
fashion, we may find that different candidates are associ- 
ated with the same input cluster. In this case, we only 
keep the candidate whose flux is closest to the common 
input cluster, and we flag the other candidates as false 
detections (multiple detections). 
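
The association procedure just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for this work: the function name `associate`, the dictionary keys `x`, `y` and `flux`, and the use of a flat-sky Euclidean distance are all hypothetical choices made for the example.

```python
import math

def associate(detections, clusters, pixel_size):
    """Match each detection to at most one input cluster.

    detections, clusters: lists of dicts with keys "x", "y" (flat-sky
    coordinates in the same angular unit as pixel_size) and "flux".
    Returns one entry per detection: the index of the associated input
    cluster, or None for a false (or multiple) detection.
    """
    r = math.sqrt(8.0) * pixel_size  # association radius r = sqrt(8) * d
    match = []
    for det in detections:
        near = [j for j, c in enumerate(clusters)
                if math.hypot(c["x"] - det["x"], c["y"] - det["y"]) <= r]
        if near:
            # keep the neighbor whose flux is closest to the detection's
            match.append(min(
                near, key=lambda j: abs(clusters[j]["flux"] - det["flux"])))
        else:
            match.append(None)  # no input cluster nearby: false detection

    # Several detections may share one input cluster: keep only the one
    # closest in flux and flag the others as false (multiple) detections.
    best = {}
    for i, j in enumerate(match):
        if j is None:
            continue
        diff = abs(clusters[j]["flux"] - detections[i]["flux"])
        if j not in best or diff < best[j][0]:
            best[j] = (diff, i)
    kept = {i for _, i in best.values()}
    return [j if i in kept else None for i, j in enumerate(match)]
```

With a pixel size of 30 arcsec the radius is about 85 arcsec, which is why the search covers the 24 pixels surrounding the detection pixel.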

At this stage, some associations may nevertheless be
chance alignments. We therefore employ a second parameter,
$Y_{\rm cut}$: a candidate associated with a real cluster of
flux $Y < Y_{\rm cut}$ is flagged as a false detection. We indicate
these false detections as diamonds in Figures 7, 8, 9 and 11.
The idea is that such clusters are too faint to have
been detected and the association is therefore by chance.
In the following, we take Y" C ut = 1.5 X arcmin 2 for 



AMI and SPT, respectively, and $Y_{\rm cut} = 3 \times 10^{-4}$ arcmin$^2$
for Planck. Note that these numbers are well below the
point-source sensitivity (at S/N = 5) in each case (see below
and Figure 2).

4.2. Completeness 

Figure 3 shows completeness for the three experiments
in terms of true integrated $Y$, while Table 2 summarizes
the counts. In Figure 4 we give the corresponding limiting
mass as a function of redshift. Given our cluster model,
AMI, SPT and Planck should find, respectively, about 16,
11 and 0.35 clusters/deg$^2$ at S/N > 5; these are the
numbers given in parentheses in Table 2. Cluster overlap
confusion accounts for the fact that the actual counts ex- 
tracted from the simulated surveys are higher: some clus- 
ters that would not otherwise pass the detection cut enter 
the catalog because the filter adds in flux from close neigh- 
bors. 

A detection threshold of S/N = 5 corresponds
to a point-source sensitivity of just below $Y = 5 \times 10^{-5}$
arcmin$^2$ for both AMI and SPT, as can be read off
the left-hand side of Figure 2. The surveys approach a
high level of completeness only at $Y > 10^{-4}$ arcmin$^2$,
however, due to the rise of the selection cut with core
radius seen in Figure 2. For these high resolution surveys,
point-source sensitivity gives a false idea of the survey 
completeness flux limit. 

At the same signal-to-noise threshold, Planck is essentially
complete above $Y \sim 10^{-3}$ arcmin$^2$ and should detect
about 0.4 clusters per square degree. Since most clusters
are unresolved by Planck, the survey reaches a high
completeness level near the point-source sensitivity. We also
see this from the small slope of the Planck selection cut
in Figure 2.

We emphasize that the surveys (in particular, the
high resolution surveys) are not flux limited for any
value of $q$, because increasing $q$ simply translates the
curve in Figure 2 along the $y$ axis. However, one can
approach a flux-limited catalog by selecting clusters at
S/N > $q$ and then cutting the resulting catalog at
$Y_o > Y_{\rm limit} = Q\,\sigma_Y(\theta_c = 0.1\ {\rm arcmin})$,
where the constant $Q > q$. As $Q$ increases we tend towards
a catalog for which $Y \sim Y_o > Y_{\rm limit}$. In the case of
SPT with $q = 3$, for example, we find that large values of
$Q$ (> 10) are required to approach a reasonable flux-limited
catalog; this construction, however, throws away a very large
number of detected clusters.
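
The two-step construction above (an S/N cut followed by a flux cut) can be written as a short Python sketch. The names `flux_limited_subsample`, `snr`, `y_rec` and `sigma_ref` are hypothetical; `sigma_ref` stands for the filter noise evaluated at a core radius of 0.1 arcmin, and we assume the cut is applied to the recovered flux.

```python
def flux_limited_subsample(catalog, q, Q, sigma_ref):
    """Approximate a flux-limited catalog from an S/N-selected one.

    catalog   : list of dicts with keys "snr" and "y_rec" (recovered flux)
    q         : detection threshold in S/N
    Q         : constant setting the flux limit, must exceed q
    sigma_ref : filter noise at theta_c = 0.1 arcmin (same unit as y_rec)
    """
    if Q <= q:
        raise ValueError("Q must be larger than q")
    y_limit = Q * sigma_ref  # flux limit of the trimmed catalog
    return [c for c in catalog if c["snr"] > q and c["y_rec"] > y_limit]
```

Raising `Q` pushes `y_limit` upward, so the S/N cut stops mattering for an increasing range of core radii, at the price of discarding the faint detections.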

Although the AMI (single frequency) and SPT (multiband)
survey maps have comparable depth, SPT will cover
~4000 sq. degrees, compared to AMI's ~10 sq. degrees.
Planck will only find the brightest clusters, but with full
sky coverage. Predictions for the counts suffer from cluster
modeling uncertainties, but the comparison between
experiments is robust and of primary interest here.

Fig. 4. Minimum detectable cluster mass as a function of
redshift, $M(z)$, corresponding to S/N = 5 for the three
experiments discussed in the text. The rise at low redshift
for the single-frequency (AMI) curve is caused by confusion
with primary CMB anisotropy.




[Figure 5 plot; x-axis: Y recovered [arcmin$^2$], from $10^{-5}$ to $10^{-1}$]

Fig. 5. Contamination as a function of recovered flux $Y_o$
for the three experiments and for S/N > 5.



4.3. Contamination 

Figure 5 shows the contamination level at S/N > 5 for
each survey type as a function of recovered flux $Y_o$. The
multiband experiments (SPT and Planck) benefit from 
low contamination at all fluxes. Single frequency surveys 
(e.g., AMI), on the other hand, experience a slightly higher 
contamination level at large flux due to confusion from 
primary CMB anisotropy. This confusion also degrades 
the photometry, as we discuss below. 

At S/N > 5, the AMI, SPT and Planck catalogs have
less than 2% total contamination rate. These numbers
increase to ~23, 20 and 27 percent, respectively, for AMI,
SPT and Planck at a detection threshold of S/N > 3.
Note that the total contamination rate is an average over
the histogram of Figure 5 weighted by the number of
objects in each bin; thus, the higher contamination at large
flux is down-weighted in the total rate.

Fig. 6. Completeness-purity plot. For each curve, $q$ varies
from 3 (top-left) to 10 (bottom-right). For each experiment,
the input catalog contains clusters with true flux
greater than three times the point source sensitivity
($Y_{\rm true} > 2.2 \times 10^{-5}$ arcmin$^2$ for AMI,
$Y_{\rm true} > 2.6 \times 10^{-5}$ arcmin$^2$ for SPT and
$Y_{\rm true} > 4.8 \times 10^{-4}$ arcmin$^2$ for Planck).
See text for details.

[Figure 7 plot; x-axis: Y true [arcmin$^2$], from $10^{-5}$ to $10^{-1}$]

Fig. 7. Recovered vs. true flux for SPT clusters extracted
at S/N > 5 from 100 survey simulations. The diamonds
indicate cluster detections with $Y < Y_{\rm cut}$, which we take
as false detections. The mean trend $Y_o(Y)$ has a slight
bias (see text) and a roughly constant scatter of
$\sigma_{\log Y_o} = 0.17$ over the interval in true $Y$ from
$10^{-4}$ arcmin$^2$ to $4 \times 10^{-3}$ arcmin$^2$. The clusters
which have their core radii overestimated by a factor of 2 are
plotted as red crosses and the clusters which have their core
radii underestimated by a factor of 2 are plotted as blue
triangles.

In all cases, the contamination rate is higher than
expected from pure Gaussian noise fluctuations; there is
an important contribution from cluster-cluster confusion
(residuals from cluster subtraction and overlaps). We
expect even higher contamination rates in practice, because
of variations in cluster morphology around the filter
templates. We quantify this latter effect below.

A useful summary of these results is a completeness-purity
plot, as shown in Figure 6. Proper comparison of
the different experiments requires an appropriate choice of
input catalog used to define the completeness in this plot.
Here, we take the input catalog as all clusters with (true)
flux greater than three times the point source sensitivity for
each experiment. If the clusters were point sources and the
detection method perfect (i.e., not affected by confusion),
the completeness would be 1 for $q = 3$ in the top-left
corner. These curves summarize the efficiency of our cluster
detection method; however, they give no information on
the photometric capabilities of the experiments.
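
As an illustration of how one point on such a completeness-purity curve could be computed from the association results, consider the following Python sketch. The function and argument names are hypothetical, and the treatment of detections matched to clusters fainter than the reference flux (here counted as true detections for the purity) is an assumption, not a statement of the convention adopted for Figure 6.

```python
def completeness_and_purity(input_fluxes, matches, y_min):
    """One point of a completeness-purity curve.

    input_fluxes : true fluxes of all input clusters
    matches      : one entry per detection, the index of the associated
                   input cluster or None for a false detection
    y_min        : minimum true flux defining the reference input catalog
    """
    reference = {i for i, y in enumerate(input_fluxes) if y > y_min}
    recovered = {m for m in matches if m is not None}
    # completeness: fraction of reference clusters that were recovered
    completeness = (len(recovered & reference) / len(reference)
                    if reference else float("nan"))
    # purity: fraction of detections associated with a real cluster
    n_true = sum(m is not None for m in matches)
    purity = n_true / len(matches) if matches else float("nan")
    return completeness, purity
```

Repeating the calculation for catalogs extracted at thresholds from $q = 3$ to $q = 10$ would trace out one curve per experiment.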
4.4. Photometry

We now turn to the important but often neglected issue
of cluster SZ photometry. The ability of an SZ survey
to constrain cosmology relies on application of the
$Y$-$M$ relation. As mentioned, we expect the intrinsic
(or true) flux to tightly correlate with cluster mass
(Bartlett 2001), as indeed borne out by numerical simulations
(da Silva et al. 2004; Motl et al. 2005; Nagai 2005).
Nevertheless, unknown cluster physics could affect the
exact form and normalization of the relation, pointing
up the necessity of an empirical calibration (referred to
as survey calibration), either with the survey data itself
(self-calibration; Hu 2003; Majumdar & Mohr 2003;
Lima & Hu 2004; Lima & Hu 2005) or using external
data, such as lensing mass estimates (Bartelmann 2001)
(although the latter will be limited to relatively low
redshifts).

Photometric measurement accuracy and precision are as
important as cluster physics in this context: what matters
in practice is the relation between recovered SZ flux $Y_o$
and cluster mass $M$. Biased SZ photometry (bias in the
$Y$-$Y_o$ relation) will change the form and normalization
of the $Y_o$-$M$ relation, and noise will increase the scatter.
One potentially important source of photometric error for
the matched filter comes from cluster morphology, i.e., the
fact that cluster profiles do not exactly follow the filter
shape (see Section 5).

Survey calibration will help remove the bias, but with 
an ease that depends on the photometric scatter: large 
scatter will increase calibration uncertainty and/or neces- 
sitate a larger amount of external data. In addition, scat- 
ter will degrade the final cosmological constraints (e.g., 
Lima & Hu 2005). Photometry should therefore be considered
an important evaluation criterion for cluster catalog
extraction methods.

Consider, first, SPT photometry. Figure 7 shows the
relation between observed (or recovered) flux $Y_o$ and true
flux $Y$ for a detection threshold of S/N > 5. Fitting for
the average trend of $Y_o$ as a function of $Y$, we obtain

$$\log Y_o = 0.96 \log Y - 0.15$$



Fig. 8. Recovered vs. true flux for Planck clusters
extracted at S/N > 5 from 100 survey simulations. The
diamonds indicate cluster detections with $Y < Y_{\rm cut}$, which
we take as false detections. The mean trend $Y_o(Y)$ has
a slight bias (see text) and a roughly constant scatter
of $\sigma_{\log Y_o} = 0.13$ over the interval in true $Y$ from
$2 \times 10^{-3}$ arcmin$^2$ to $2 \times 10^{-2}$ arcmin$^2$. The
clusters which have their core radii overestimated by a factor
of 2 are plotted as red crosses and the clusters which have
their core radii underestimated by a factor of 2 are plotted as
blue triangles.

over the interval $10^{-4}$ arcmin$^2$ $< Y < 4 \times 10^{-3}$ arcmin$^2$,
with $Y_o$ and $Y$ measured in arcmin$^2$. There is a slight bias
in that the fit deviates somewhat from the equality line,
but the effect is minor. Below this flux interval, the fit curls
upward in a form of Malmquist bias caused by the S/N
cut (seen as the sharp lower edge on $Y_o$). The lack of any
significant bias is understandable in this ideal case where
the filter perfectly matches the cluster SZ profile. Cluster
morphology (by which we mean a mismatch between the
cluster SZ profile and the matched filter template) can
induce bias; we return to this issue in Section 5.

The scatter about the fit is consistent with a Gaussian
distribution with a roughly constant standard deviation
of $\sigma_{\log Y_o} = 0.17$ over the entire interval.
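
A fit of this kind, together with the scatter about it, can be obtained with an ordinary least-squares regression in log space. The sketch below is only an illustration: the function name is hypothetical, and the use of an unweighted fit with the residual standard deviation as the scatter estimate is an assumption about the procedure rather than a description of it.

```python
import math

def fit_log_relation(y_true, y_rec):
    """Least-squares fit of log10(y_rec) = a * log10(y_true) + b.

    Returns (a, b, scatter), where scatter is the standard deviation
    of the residuals in log10 units (dex).
    """
    lx = [math.log10(v) for v in y_true]
    ly = [math.log10(v) for v in y_rec]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((x - mx) ** 2 for x in lx)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lx, ly))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [y - (slope * x + intercept) for x, y in zip(lx, ly)]
    # two fitted parameters, hence n - 2 degrees of freedom
    scatter = math.sqrt(sum(r * r for r in resid) / (n - 2))
    return slope, intercept, scatter
```

In practice the fit should be restricted to the flux interval quoted in the text, since the S/N cut bends the relation at lower fluxes.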

The scatter is a factor of 10 larger than expected from 
instrumental noise alone, which is given by the selection 
curve in Figure 2. Uncertainty in the recovered cluster
position, core radius and effects from cluster-cluster con- 
fusion all strongly influence the scatter. Photometry pre- 
cision, therefore, cannot be predicted from instrumental 
noise properties alone, but only with simulations account- 
ing for these other, more important effects. 

Figure 8 shows the photometry for the Planck survey.
Apart from some catastrophic cases (the diamonds), the
photometry is good and fit by

$$\log Y_o = 0.98 \log Y - 0.07$$

over the interval $2 \times 10^{-3}$ arcmin$^2$ $< Y < 2 \times 10^{-2}$ arcmin$^2$
($Y_o$, $Y$ measured in arcmin$^2$). The dispersion is
$\sigma_{\log Y_o} = 0.13$, roughly constant over the same interval.
For unresolved clusters, this scatter is ~5 times larger than the
expected instrument-induced scatter. The brightest diamonds
in the figure correspond to real clusters with positional
error larger than the association radius $r$. As a
consequence, the candidates are falsely associated with a
small, nearby cluster, unrelated to the actual detected
object.

We emphasize that the observational scatter in the
$Y_o$-$Y$ relation for both SPT and Planck dominates the
intrinsic scatter of less than 5% seen in the $Y$-$M$
relation from numerical simulations (da Silva et al. 2004;
Motl et al. 2005).
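
To compare the two numbers directly, the logarithmic dispersion can be converted to a fractional one. The short sketch below assumes that the quoted dispersions are in base-10 logarithm (dex); the function name is hypothetical.

```python
def dex_to_fractional(sigma_dex):
    """Upward fractional excursion corresponding to a 1-sigma
    scatter expressed in log10 units (dex)."""
    return 10.0 ** sigma_dex - 1.0

# SPT-like and Planck-like dispersions quoted in the text
spt_fraction = dex_to_fractional(0.17)     # a bit under 50%
planck_fraction = dex_to_fractional(0.13)  # roughly 35%
```

Under this assumption, both values are several times the intrinsic scatter of less than 5%.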

We now turn to single frequency surveys, which
Figure 9 shows to have seriously compromised photometry.
The distribution at a given true flux $Y$ is in fact
bimodal, as illustrated by the solid blue histogram in
Figure 10, which gives the distribution of the recovered flux
$Y_o$ for clusters with true flux and core radius in a bin
centered on $Y = 1.5 \times 10^{-4}$ arcmin$^2$ and $\theta_c = 0.3$ arcmin.
We have traced this effect to an inability to accurately
determine the core radius of the candidate clusters. We
demonstrate this in Figure 11 by artificially setting the
candidate core radius to its true value taken from the
associated input cluster; the photometry now cleanly scatters
about the mean trend.

This inability to determine the core radius mainly
arises from confusion with primary CMB anisotropy, as
we now show using Figure 10. We performed 1000 simulations
of a single cluster ($Y = 1.5 \times 10^{-4}$ arcmin$^2$,
$\theta_c = 0.3$ arcmin) placed at the middle of a beam-convolved
map containing background SZ clusters (from our general
simulations), primary CMB anisotropy and instrumental
noise. We then estimate its core radius and flux with our
matched filters centered on the known position (to avoid
any positional uncertainty) and trace the histogram of
resulting measured flux. This is the red dotted histogram
in the figure, which displays a bimodality similar
to that of the blue solid histogram. We then follow the
same procedure after first removing the primary CMB
anisotropy from the simulated map. The resulting histogram
of recovered flux is given by the black dot-dashed
line, with much less pronounced bimodality. The remaining
tail reaching towards high flux is caused by cluster-cluster
confusion.

With their additional spectral information, multiband 
surveys remove the primary CMB signal, thereby avoiding 
this source of confusion. The result suggests that follow-up 
observations of detected clusters at a second frequency will 
be required for proper photometry; without such follow-up, 
the scientific power of a single frequency survey may be 
seriously compromised, as can be appreciated from
inspection of Figure 9.

5. Additional Effects 

Fig. 9. Recovered vs. true flux for AMI clusters extracted
at S/N > 5 from 100 survey simulations. The diamonds
indicate cluster detections with $Y < Y_{\rm cut}$, which we take
as false detections. The extremely large dispersion in
recovered flux results from a bimodal distribution caused by
an inability to determine the core radius of detected
clusters. This inability is due to confusion from primary CMB
anisotropy, as demonstrated in Figure 10. Figure 11 shows
that reasonable photometry is possible if the core radius
can be accurately determined. This problem is specific to
single-frequency surveys that are unable to spectrally
remove primary CMB anisotropy.

As emphasized, our previous results follow for a filter that
perfectly matches the (spherical) clusters in our simulations
and in the absence of any point sources. In this section
we examine the effects of both cluster morphology
and point sources.

We find that cluster morphology has little effect on 
catalog completeness, but that it does increase contami- 
nation. More importantly, it can bias photometric recov- 
ery, although it does not significantly increase the scatter. 
This bias changes the observed Y — M relation from its in- 
trinsic form, adding to the modeling uncertainty already 
caused by cluster gas physics. For this reason, the rela- 
tion must be calibrated in order to use the SZ catalog for 
any cosmological study. The observational bias would be 
removed during this calibration step. 

Completeness is the most affected by point source con- 
fusion, decreasing somewhat for the multi-band surveys in 
the presence of IR point sources. The level of confusion for 
the single frequency survey remains highly uncertain due 
to the unknown point source counts at low flux densities. 
Contamination and photometry are essentially unaffected. 



5.1. Cluster Morphology 

To assess the influence of cluster morphology, we ran our
catalog extraction algorithm on maps constructed from
numerical simulations. We use the simulations presented
by Schulz & White (2003) and kindly provided to us by
M. White. Their simulations follow dark matter clustering
with an N-body code in a flat concordance cosmology, and
model cluster gas physics with semi-analytical techniques
by distributing an isothermal gas of mass fraction
$\Omega_b/\Omega_m$ according to the halo dark matter
distribution. For details, see Schulz & White (2003). In the
following, we refer to these simulations as the "N-body"
simulations.

Fig. 10. The solid blue histogram gives the cluster counts
from Figure 9 in the bin ($10^{-4} < Y < 2 \times 10^{-4}$,
$0.25 < \theta_c < 0.35$). We have added the cluster counts
obtained from the size and flux estimation of a single cluster
($Y = 1.5 \times 10^{-4}$, $\theta_c = 0.3$) at a known position
through 1000 simulations. SZ cluster background maps and the
instrumental beam and noise are included. Two cases are
considered: with primary CMB (dotted red histogram) and without
primary CMB (dash-dotted black line). The double bump in
$Y$ recovery is visible when the primary CMB is present
and disappears when it is removed, showing that the primary
CMB power spectrum is the cause of the double bump.

[Figure 11 plot; x-axis: Y true [arcmin$^2$]]

Fig. 11. Single-frequency photometry when we artificially
set the core radii of detected clusters to their true values
from the input catalog. The dispersion decreases dramatically,
demonstrating that the inability to recover the core
radius is the origin of the bad photometry seen in Figure 9.



Fig. 12. Photometry for the SPT catalog from the N- 
body simulations. Cluster morphology (mismatch between 
the filter profile and the actual cluster SZ profile) clearly 
induces a bias between the recovered and true SZ flux. 
The scatter, on the other hand, is not very affected, as 
can be seen in comparing with Figure 7.



We proceed by comparing catalogs extracted from the
N-body map to those from a corresponding simulation
made with spherical clusters. The latter is constructed by
applying our spherical $\beta$-model gas distribution to the
cluster halos taken from the N-body simulation and using
them as input to our Monte Carlo sky maps. In the process,
we renormalize our $Y$-$M$ relation to the one used in
the N-body SZ maps. We thus obtain two SZ maps containing
the same cluster halos, one with spherical clusters
(referred to hereafter as the "$\beta$-model" maps) and the
other with more complex cluster morphology (the N-body
maps). Comparison of the catalogs extracted from the two
different types of simulated map gives us an indication of
the sensitivity of our method to cluster morphology. We
make this comparative study only for the SPT- and
Planck-like surveys.

Catalog completeness is essentially unaffected by cluster
morphology; the integrated counts, for example, follow
the same curves shown in Figure 3 with very little
deviation, the only difference being a very small decrease in
the Planck counts at the lowest fluxes. The effect, for
example, is smaller than that displayed in Figure 13 due to
point source confusion (and discussed below).

Non-trivial cluster morphology, however, does significantly
increase the catalog contamination rate; for example,
in the SPT survey the global contamination rises from
less than 2% to 13% at S/N = 5 for the N-body simulations.
We trace this to residual flux left in the maps after
cluster extraction: cluster SZ signal that deviates from
the assumed spherical $\beta$-model filter profile remains in
the map and is picked up later as new cluster candidates.
Masking those regions where a cluster has been previously
extracted (i.e., forbidding any cluster detection) drops the
contamination to 4% (SPT case), but causes a decrease
of 2.8 clusters per square degree in the recovered counts;
this technique would also have important consequences for
clustering studies.
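
The masking step can be illustrated with a simple exclusion rule. In this sketch, which is not the implementation used here, candidates are processed in order of decreasing S/N and any candidate falling within a hypothetical `mask_radius` of an already accepted one is rejected; the text does not specify the size or shape of the masked regions, so both the radius and the ordering are assumptions.

```python
import math

def apply_extraction_mask(candidates, mask_radius):
    """Reject candidates that fall inside a region already masked.

    candidates  : list of dicts with keys "x", "y" (flat-sky position)
                  and "snr"
    mask_radius : radius of the disc masked around each accepted
                  cluster, in the same unit as the positions
    """
    accepted = []
    # brightest (highest S/N) candidates are assumed to be extracted first
    for cand in sorted(candidates, key=lambda c: c["snr"], reverse=True):
        outside = all(
            math.hypot(cand["x"] - a["x"], cand["y"] - a["y"]) > mask_radius
            for a in accepted)
        if outside:
            accepted.append(cand)
    return accepted
```

A rule of this kind removes residual-induced false detections, but it also removes genuine close neighbors, which is the origin of the loss in recovered counts and of the impact on clustering studies.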

From Figure 12 we clearly see that cluster morphology
induces a bias in the photometry. This arises from the fact
that the actual cluster SZ profiles differ from the template
adopted for the filter. The differences are of two types: an
overall difference in the form of radial profile and local
deviations about the average radial profile due to cluster
substructure. It is the former that is primarily responsible
for the bias. In our case, the N-body simulations have
much more centrally peaked SZ emission than the filter
templates, which causes the filter to systematically
underestimate the total SZ flux. Cluster substructure will
increase the scatter about the mean $Y_o$-$Y$ relation. This
latter effect is not large, at least for the N-body simulations
used here, as can be seen by comparing the scatter
in Figures 7 and 12.

We emphasize, however, that the quantitative effects
on photometry depend on the intrinsic cluster profile, and
hence are subject to modeling uncertainty. The simulations
used here do not include gas physics and simply assume
that the gas follows the dark matter. The real bias
will depend on unknown cluster physics, thus adding to
the modeling uncertainty in the $Y$-$M$ relation. This
uncertainty, due to both cluster physics and the photometric
uncertainty discussed here, must be dealt with by empirically
calibrating the relation, either with external data
(lensing) and/or internally (self-calibration).

Fig. 13. Integrated cluster counts for the three types of
survey. The upper curve in each pair reproduces the results
of Figure 3, while the lower curve shows the effect of
point source confusion. Despite the large IR point source
population, multiband surveys efficiently eliminate confusion.
The AMI-like survey is, on the other hand, strongly
affected. This latter effect remains uncertain due to a lack
of information on the faint end of the radio point source
counts (see text).



5.2. Point Sources 

We next examine the effect of point sources. In a previous 
paper (Bartlett & Melin 2005, hereafter BM) we studied
their influence on survey detection sensitivity. We extend 
this work to our present study in this section. 

Low frequency surveys, such as our AMI example, contend
with an important radio source population, while
higher frequency bolometer surveys face a large population
of IR sources. Radio source counts down to the sub-mJy
flux levels relevant for SZ surveys are unfortunately
poorly known. The IR counts are somewhat better constrained
at fluxes dominating the fluctuations in the IR
background, although at higher frequencies (850 microns)
than those used in SZ surveys; an uncertain extrapolation
in frequency is thus necessary.

For the present study, we use the radio counts fit
by Knox et al. (2004) to a combination of data from CBI,
DASI, VSA and WMAP (see also Eq. 6 in BM), and IR
counts fit to blank-field SCUBA observations at 850 microns
by Borys et al. (2003) (and given by Eq. 8 in BM). We
further assume that all radio sources brighter than 100 $\mu$Jy
have been subtracted from our maps at 15 GHz (AMI
case); this is the target sensitivity of the long baseline Ryle
Telescope observations that will perform the source
subtraction for AMI. No such explicit point source subtraction
is readily available for the higher frequency bolometer
surveys; they must rely solely on their frequency coverage
to reduce point source confusion. We therefore include all
IR sources in our simulations, and fix their effective
spectral index $\alpha = 3$ with no dispersion$^{12}$. We refer
the reader to BM for details of our point source model. Note
that for this study we use the spherical cluster model for
direct comparison to our fiducial results.
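
The frequency extrapolation of the IR source fluxes can be sketched as follows. We assume here the common power-law convention $S_\nu \propto \nu^{\alpha}$ for the effective spectral index; the precise definition adopted in BM should be checked there, and the function name is hypothetical.

```python
def extrapolate_flux(s_ref, nu_ref, nu, alpha=3.0):
    """Scale a source flux density from nu_ref to nu, assuming
    S(nu) = S(nu_ref) * (nu / nu_ref) ** alpha.

    s_ref  : flux density at the reference frequency
    nu_ref : reference frequency (e.g. in GHz)
    nu     : target frequency, in the same unit as nu_ref
    alpha  : effective spectral index (3 in the text, no dispersion)
    """
    return s_ref * (nu / nu_ref) ** alpha
```

With $\alpha = 3$, halving the frequency reduces the flux density by a factor of 8, which is why counts measured at 850 microns translate into much fainter sources in the SZ bands.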

Figure 13 compares the integrated counts from
Figure 3 (upper curve in each case) to those extracted from
the simulations including point sources (lower curves). We 
see that point source confusion only slightly decreases the 
completeness of the multiband surveys, but greatly affects 
the single frequency survey. 

In the case of SPT, this is because point source con- 
fusion remains modest compared to the noise: the two are 
comparable at 150 GHz, but the noise power rises more 
quickly with frequency than the confusion power (see BM 
for details) - in other words, the noise is bluer than the 
confusion. This is an important consideration when look- 
ing for the optimal allocation of detectors to the observa- 
tion bands. 

For Planck, confusion power dominates at all frequen- 
cies, but the spectral coverage provides sufficient leverage 
to control it. In this light, it must be emphasized that we 
only include three astrophysical signals (SZ, CMB & point 
sources) in these simulations, so that three observation 
bands are sufficient. In reality, one will have to deal with 
other foregrounds, e.g., diffuse Galactic emission, which 
will require the use of additional observation bands. 



12 As discussed in BM, any dispersion has only a small effect
on survey sensitivity.

The single frequency observations, on the other hand,
are strongly affected. This is consistent with the estimate
in BM (Eq. 15) placing confusion noise well above
instrumental noise for the chosen point source model and source
subtraction threshold. We emphasize the uncertainty in
this estimate, however: in BM we showed, for example,
that a model with flattening counts has much lower source
confusion while remaining consistent with the observed
counts at high flux densities. The actual confusion level
remains to be determined from deeper counts at CMB
frequencies (see Waldram et al. 2003; Waldram et al. 2004
for recent deep counts at 15 GHz).

Contamination in the multiband surveys is practically 
unaffected by point source confusion. For AMI we actu- 
ally find a lower contamination rate, an apparent gain ex- 
plained by the fact that the catalog now contains only the 
brighter SZ sources, due to the lowered sensitivity caused 
by point source confusion. 

The photometry of the multiband surveys also shows 
little effect from the point sources. Fits to the recovered 
flux vs. true flux relation do not differ significantly from 
the no-source case, and the dispersion remains essentially 
the same. This is consistent with the idea that point source 
confusion is either modest compared to the noise (SPT)
or controlled by multiband observations (Planck). 

6. Discussion and Conclusion 

We have described a simple, rapid method based on 
matched multi-frequency filters for extracting cluster cat- 
alogs from SZ surveys. We assessed its performance when 
applied to the three kinds of survey listed in Table 1. The
rapidity of the method allows us to run many simulations 
of each survey to accurately quantify selection effects and 
observational uncertainties. We specifically examined cat- 
alog completeness, contamination rate and photometric 
precision. 

Figure 2 shows the cluster selection criteria in terms
of total SZ flux and source size. It clearly demonstrates
that SZ surveys, in particular high resolution ground-based
surveys, will not be purely flux limited, something
which must be correctly accounted for when interpreting
catalog statistics (Melin et al. 2005).

Figure 3 and Table 2 summarize the expected yield
for each survey. The counts roll off at the faint end well
before the point source flux limit (intercept of the curves
in Figure 2 multiplied by the S/N limit) even at the high
detection threshold of S/N = 5; the surveys lose completeness
precisely because they are not purely flux-limited.
These yields depend on the underlying cluster model and
are hence subject to non-negligible uncertainty. They are
nonetheless indicative, and in this work we focus on the
nature of observational selection effects for which the
exact yields are of secondary importance.

At our fiducial S/N=5 detection threshold, overall cat- 
alog contamination remains below 5%, with some depen- 
dence on SZ flux for the single frequency survey (see 
Figure 5). The overall contamination rises to between 20%



and 30% at S/N>3. We note that the contamination rate 
is always larger than expected from pure instrumental 
noise, pointing to the influence of astrophysical confusion. 

We pay particular attention to photometric precision,
an issue often neglected in discussions of the scientific
potential of SZ surveys. Scatter plots for the recovered flux
for each survey type are given in Figures 7, 8 and 9. In the
two multiband surveys, the recovered SZ flux is slightly
biased, due to the flux cut, with a dispersion of
$\sigma_{\log Y_o} = 0.17$ and $\sigma_{\log Y_o} = 0.13$ for SPT
and Planck, respectively. This observational dispersion is
significantly larger than the intrinsic dispersion in the
$Y$-$M$ relation predicted by hydrodynamical simulations.
This uncertainty must be properly accounted for in scientific
interpretation of SZ catalogs; specifically, it will degrade
survey calibration and cosmological constraints.

Even more importantly, we found that astrophysical
confusion seriously compromises the photometry of
the single frequency survey (Figure 9). The histogram in
Figure 10 shows that the recovered flux has in fact a
bimodal distribution. We traced the effect to an inability
to determine source core radii in the presence of primary
CMB anisotropy. If cluster core radius could be accurately
measured, e.g., with X-ray follow-up, then we would
obtain photometric precision comparable to the multiband
surveys (see Figure 11). This confusion can also be
removed by follow-up of detected sources at a second
radio frequency (e.g., 90 GHz). Photometric uncertainty will
therefore be a key limiting factor in single frequency SZ
surveys.

All these results apply to the ideal case where the filter 
exactly matches the (simulated) cluster profiles. We then 
examined the potential impact of cluster morphology and 
point sources on these conclusions. 

Using N-body simulations, we found that cluster mor- 
phology has little effect on catalog completeness, but that 
it does increase the contamination rate and bias the pho- 
tometry. The increased contamination is caused by de- 
viations from a smooth radial SZ profile that appear as 
residual flux in the maps after source extraction. More 
importantly, the photometry is biased by the mismatch 
between the filter template and the actual cluster profile. 
This observational bias adds to the modeling uncertainty 
in the Y — M relation, which will have to be empirically 
determined in order to use the catalog for cosmology stud- 
ies. 

As shown by Figure 13, point sources decrease survey
completeness. The multiband surveys effectively reduce
IR point source confusion and suffer only a small
decrease. Radio source confusion, on the other hand, greatly
decreases the completeness of the single frequency survey.
This is consistent with the expectation that, for our
adopted radio point source model and source subtraction
threshold, point source confusion dominates instrumental
noise. Modeling uncertainty here is, however, very large:
radio source counts are not constrained at relevant fluxes
(~100 $\mu$Jy), which requires us to extrapolate counts from
mJy levels (see BM for a more detailed discussion).

Surveys based on the SZ effect will open a new window
onto the high redshift universe. They inherit their
strong scientific potential from the unique characteristics
of the SZ signal. Full realization of this potential, however,
requires understanding of observational selection effects
and uncertainties. Overall, multiband surveys appear robust
in this light, while single frequency surveys will most
likely require additional observational effort, e.g.,
follow-up in other wavebands, to overcome large photometric
errors caused by astrophysical confusion with primary CMB
anisotropy.